 nonlinear map



Null Space Properties of Neural Networks with Applications to Image Steganography

arXiv.org Artificial Intelligence

Neural networks are powerful learning methods used for a wide range of tasks. This is especially true in image recognition, where neural networks can achieve human-competitive results [13]. However, a number of studies have shown that image classifiers based on neural networks can easily be made to misclassify by modifying the input images [1]. In 2014, Szegedy et al. discovered an intriguing weakness of deep neural networks [15]: networks for image classification can be fooled by small perturbations. They called these intentionally modified images adversarial examples. Following this observation, numerous studies have proposed different ways to generate adversarial examples [7, 11, 13]. The main idea is to find a subtle perturbation that, when added to the data, drastically changes the output of a neural network. Adversarial examples have been observed to transfer well across models, which suggests that their existence is also a property of the dataset [8] and is not restricted to a given model. In our study, we aim to find a model-based method to fool neural networks.
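The idea of a small additive perturbation that changes the output can be illustrated on a toy linear classifier. The sketch below is not the method of the paper above; it uses a sign-based perturbation in the spirit of gradient-sign attacks, and the weights `w`, bias `b`, input `x`, and budget `eps` are made-up values chosen for illustration.

```python
def score(w, b, x):
    """Linear classifier score; the predicted class is the sign of the score."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sign(t):
    """Return -1, 0, or 1 according to the sign of t."""
    return (t > 0) - (t < 0)


def sign_perturbation(w, b, x, eps):
    """Move each coordinate by at most eps in the direction that pushes
    the score toward the opposite class (hypothetical helper)."""
    s = sign(score(w, b, x))
    return [xi - eps * s * sign(wi) for wi, xi in zip(w, x)]


# Illustrative values only (assumptions, not taken from the source).
w = [0.5, -0.25, 0.75, -1.0]
b = 0.0
x = [1.0, 0.0, 0.5, 0.5]

x_adv = sign_perturbation(w, b, x, eps=0.25)
original_score = score(w, b, x)       # positive: class +1
perturbed_score = score(w, b, x_adv)  # negative: class -1
print(original_score, perturbed_score)
```

Each coordinate moves by at most `eps`, yet the score shifts by `eps` times the l1 norm of `w`, which is why a perturbation that is small per coordinate can still flip the prediction.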


Spherical Random Features for Polynomial Kernels

Neural Information Processing Systems

Compact explicit feature maps provide a practical framework to scale kernel methods to large-scale learning, but deriving such maps for many types of kernels remains a challenging open problem. Among the commonly used kernels for nonlinear classification are polynomial kernels, for which low approximation error has thus far necessitated explicit feature maps of large dimensionality, especially for higher-order polynomials. Meanwhile, because polynomial kernels are unbounded, they are frequently applied to data that has been normalized to unit l2 norm. The question we address in this work is: if we know a priori that data is so normalized, can we devise a more compact map? We show that a putative affirmative answer to this question based on Random Fourier Features is impossible in this setting, and introduce a new approximation paradigm, Spherical Random Fourier (SRF) features, which circumvents these issues and delivers a compact approximation to polynomial kernels for data on the unit sphere. Compared to prior work, SRF features are less rank-deficient, more compact, and achieve better kernel approximation, especially for higher-order polynomials. The resulting predictions have lower variance and typically yield better classification accuracy.
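One reason the unit-norm assumption is useful: for unit-norm x and y, the inner product satisfies <x, y> = 1 - ||x - y||^2 / 2, so a polynomial kernel can be written as a function of the distance between the two points. The sketch below only checks this identity numerically; it does not implement SRF features, and the offset `c` and degree `p` are assumed values chosen for illustration.

```python
import math
import random


def unit_vector(dim, rng):
    """Draw a Gaussian vector and scale it to unit l2 norm."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(t * t for t in v))
    return [t / norm for t in v]


def poly_kernel(x, y, c=1.0, p=3):
    """Polynomial kernel (<x, y> + c)^p; c and p are illustrative choices."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + c) ** p


def poly_kernel_from_distance(x, y, c=1.0, p=3):
    """The same kernel expressed through ||x - y||^2.
    Valid only when both inputs have unit l2 norm."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return (1.0 + c - sq_dist / 2.0) ** p


rng = random.Random(0)
x = unit_vector(8, rng)
y = unit_vector(8, rng)

k_dot = poly_kernel(x, y)
k_dist = poly_kernel_from_distance(x, y)
print(abs(k_dot - k_dist) < 1e-9)
```

Without the normalization step the two functions disagree in general, since the identity relies on ||x|| = ||y|| = 1.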


Compact Nonlinear Maps and Circulant Extensions

arXiv.org Machine Learning

Kernel approximation via nonlinear random feature maps is widely used to speed up kernel machines. Conventional kernel approximation methods face two main challenges. First, a good kernel has to be chosen before kernel approximation can be performed, and picking a good kernel is a very challenging problem in itself. Second, high-dimensional maps are often required to achieve good performance, which leads to high computational cost both in generating the nonlinear maps and in the subsequent learning and prediction. In this work, we propose to optimize the nonlinear maps directly with respect to the classification objective in a data-dependent fashion. The proposed approach achieves kernel approximation and kernel learning in a joint framework, which leads to much more compact maps without hurting performance. As a by-product, the same framework can also be used to obtain more compact kernel maps that approximate a known kernel. We also introduce Circulant Nonlinear Maps, which use a circulant-structured projection matrix to speed up the nonlinear maps for high-dimensional data.
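A circulant projection matrix is fully determined by a single vector of length d, which is where the savings come from. The sketch below is a minimal illustration under assumptions, not the authors' implementation: the cosine nonlinearity, the random offsets `b`, and the helper names are hypothetical, and the product is computed with a direct O(d^2) loop for clarity where an FFT-based version would take O(d log d).

```python
import math
import random


def circulant_matvec(r, x):
    """Multiply the circulant matrix with first column r by the vector x.
    Entry (i, j) of the matrix is r[(i - j) % d], so only d numbers are stored."""
    d = len(r)
    return [sum(r[(i - j) % d] * x[j] for j in range(d)) for i in range(d)]


def circulant_cos_features(x, r, b):
    """Hypothetical nonlinear map sqrt(2/d) * cos(Rx + b) with R circulant."""
    z = circulant_matvec(r, x)
    scale = math.sqrt(2.0 / len(x))
    return [scale * math.cos(zi + bi) for zi, bi in zip(z, b)]


rng = random.Random(0)
d = 6
r = [rng.gauss(0.0, 1.0) for _ in range(d)]              # defines the whole matrix
b = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(d)]  # random phase offsets
x = [rng.gauss(0.0, 1.0) for _ in range(d)]

# Dense reference: materialize the d x d circulant matrix and multiply.
dense = [[r[(i - j) % d] for j in range(d)] for i in range(d)]
z_dense = [sum(dense[i][j] * x[j] for j in range(d)) for i in range(d)]

z_circ = circulant_matvec(r, x)
features = circulant_cos_features(x, r, b)
print(len(features))
```

The dense matrix needs d^2 stored entries while the circulant form needs d, and every row of `dense` is the previous row shifted cyclically by one position.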